Abstract: Consider a sensor network made of remote nodes
connected to a common fusion center. In a recent work, Blum and
Sadler [1] propose the idea of ordered transmissions, in which
sensors with more informative samples deliver their messages
first, and prove that optimal detection performance can be
achieved using only a subset of the total messages. Taking this
approach to one extreme, we show that just a single delivery
allows making the detection errors as small as desired, for a
sufficiently large network size: a one-bit detection scheme can
be asymptotically consistent.

The transmission ordering is based on the modulus of some
local statistic (MO system). We derive analytical results proving
the asymptotic consistency and, for the particular case in which
the local statistic is the log-likelihood ($\ell$-MO system), we
also obtain a bound on the error convergence rate. All the
theorems are proved under the general setup of a random number
of sensors. Computer experiments corroborate the analysis and
address typical application examples, including non-homogeneous
Poisson-deployed networks, detection by per-sensor censoring, and
monitoring of energy-constrained phenomena.



I. Introduction 

FOLLOWING a general trend in the area of signal process-
ing (see, e.g., [2], [3]), in the last few decades there has
been a considerable interest in distributed detection systems 
where a multitude of small sensors, properly networked to 
operate as a whole, takes the place of a single complex device 
typical of the classical system architecture. 

Distributed schemes offer several advantages over their
centralized counterparts.
These include robustness, scalability, flexibility, portability, 
failure resilience, and so forth. But the advent of distributed 
systems also poses new challenges to the signal processing 
community, since the detection layer is interleaved with the 
communication one so that new design trade-offs arise, yielding
novel design guidelines and approaches [4]-[7].

One aspect that is of primary relevance in the implementa- 
tion of many distributed detection systems is the limitation 
of the sensors' energy, with consequence on the sensors' 
capability of sensing, processing and delivering data. While 
a precise evaluation of the relative impact of these tasks 
strongly depends upon the specific network, for many wireless 
systems the task of communication is by far the most energy-
consuming [8], [9]. As a consequence, one important issue in a
wireless sensor network (WSN) is how to design the system in 
order to reduce the amount of communication, given a desired 
level of detection performance. 

The authors are with the Department of Electronic and Computer Engineer- 
ing (DIEII), University of Salerno, via Ponte don Melillo 1-84084, Fisciano 
(SA), Italy. E-mails: {pbraca, marano, vmatta}@unisa.it. 



In this regard, several approaches have been proposed in the
literature. They include the design of energy-efficient routing in
ad-hoc networks [4], the implementation of proper strategies for
the access to the common communication medium [4], and the use
of censoring, pioneered in [10] and further developed in many
successive works, see e.g., [11]-[13]. Censoring refers to the
idea of quantifying, at the sensor
level, the informativeness of the sensed samples before sharing 
the observations with other nodes or with a sink unit: only the 
data that are believed informative enough are shared, while 
the node remains silent otherwise, optimizing the battery life. 

In this work we consider a modification of this idea: we 
design a distributed detection system for binary hypothesis 
test in which the remote nodes quantify the informativeness 
(for detection) of their observations, and communicate their 
local decisions to the system, with more informative samples
communicated first. As soon as one local decision
is communicated, that is taken as the global decision of the 
network and the detection task terminates. Thus, censoring is 
obtained on a time-selective basis. 

Note that two different stages exist: the sensing stage, in
which all the nodes observe the state of nature to be
decided upon, and the subsequent data fusion stage, in which
each node is ready to deliver its local decision after a time
interval that is inversely proportional to the informativeness
of its observation. As soon as the first (most informative) local
decision is sent, the whole detection task terminates, with the
final decision equal to this quickest local decision. To make a
decision, our scheme prescribes just one communication event.

A WSN can be organized according to many different 
architectures. One possibility is that a common fusion center 
(FC) exists with the role of collecting the data that the remote 
nodes deliver, usually after some local pre-processing. In this 
case, either dedicated links connecting each node to the FC
exist, or there is a common channel that the sensors access by 
some suitable multiple access scheme. 

Another common architecture lacks any central unit, and the 
local decisions are propagated within the network by inter- 
sensor communication protocols, while collaborative signal 
processing procedures are employed to mimic the presence 
of an FC. In these "fully flat" WSNs, the final decision is taken
by the network in a distributed fashion and is usually shared
by all the nodes, as in the consensus schemes [14]-[18].

For most of this paper, we do not refer to any specific
architecture, since our goal is to investigate the detection
performance of the statistic computed at the fastest-firing node
of the network, which is largely independent of the specific
WSN architecture. For concreteness, one can imagine that,
if an FC exists, after receiving the first delivery from some
sensor, the FC broadcasts a stopping message to all the
other nodes. Conversely, in fully flat architectures, the halting
command should be propagated, along with the decision,
from the deciding node to all the other nodes by means of
some suitable multi-hop protocol or by means of consensus
algorithms. We refer to our scheme as one-bit distributed
detection; it should be noted, however, that we are disregarding
the network messages required for the halting procedure.

A. Related work & motivations 

The general approach of quickly computing an efficient
detection statistic resembles the quickest detection problem
originally introduced in [19]; see [20]-[22] for more recent
references. The substantial difference is that quickest detection
procedures are usually used to continuously monitor an
underlying phenomenon in order to discover a change in the
statistics of the observed process. Conversely, in the system we
design, the sensing stage and the detection stage are separated.
First, the environment is sensed by all the sensors, and then a
detection step is initiated. The quickness is only a property
of this second stage: it is not related to event detection but
is a means to save system energy, and there is no change in
the process statistics.

More relevant to our setup is the work by Blum and
Sadler [1]. They study a WSN engaged in a detection problem
and conceive a multiple access architecture in which each 
sensor accesses the channel after a delay inversely proportional 
to the informativeness of its measurement. They show that, 
upon receiving at the FC a certain fraction of the overall 
sensor measurements, a decision can be made at the same 
performance level achievable by using the complete set of data 
collected by the sensors. The channel access rule, indeed, is 
such that the more informative samples are delivered first so 
that transmissions can be saved, without degradation in error 
probability. 

In applications where sensors are severely battery-limited
and relatively tiny and cheap, one can push this approach to
one extreme. What if, rather than collecting at the FC a certain
fraction of the sensors' deliveries, just one single sample
is considered? This would imply a significant energy saving,
paid for with a performance degradation with respect
to the approach of [1]. However, given that sensors are tiny
and cheap, performance can be improved by increasing the
number of sensors, provided that some form of asymptotic
consistency holds. This work elaborates on this concept.

B. Main results & organization 

One main theoretical result of this paper is the proof of the
asymptotic consistency of the described one-bit system when
the informativeness of the sensors' samples is evaluated according
to the modulus $|T(\cdot)|$ of some suitable transformation
$T(\cdot)$ of the observed samples. For instance, when $T(\cdot)$
is the identity, the idea is that extreme values of the
measurements carry more information for detection than "near the
mean" observations.

We also consider as index of informativeness the modulus of
the log-likelihood of the observed sample, which is motivated
by known results on censoring [10]. When remote sensors
compute the log-likelihoods, and the delivery time is set
accordingly, besides proving the asymptotic consistency of the
test we derive bounds on the asymptotic rate of convergence of
the error probabilities. We also consider networks whose size
is random and possibly depends upon the observed data. This
allows us to cover very general application scenarios, examples
of which are given in Sect. IV.

The remainder of this paper is organized as follows. The
problem statement is described in Sect. II, the main results are
presented in Sect. III, examples of applications are provided
in Sect. IV, while in Sect. V we summarize.

II. Problem statement 

A. Preliminaries 

Consider a WSN made of $n$ remote units that sense the
surrounding environment to decide which of two mutually
exclusive states of nature, $\mathcal{H}_0$ or $\mathcal{H}_1$, is actually in force.
The observation made by sensor $i$ is modeled as a random
variable $X_i$, where $i = 1, 2, \dots, n$, and the $X_i$'s are
independent and identically distributed (iid) samples drawn from
one of the two possible marginal probability density functions (or
pdf's) $f_X(x; \mathcal{H}_j)$, $j = 0, 1$. This simple hypothesis test can be
schematically formalized as

\[ \mathcal{H}_0 : X_i \sim f_X(x; \mathcal{H}_0), \quad \text{vs.} \quad \mathcal{H}_1 : X_i \sim f_X(x; \mathcal{H}_1). \tag{1} \]

We assume throughout this work that the involved random
variables, taking values in $\mathbb{R}$, admit densities and that these
densities have unbounded support, in the sense that
$\sup_x \{f_X(x) > 0\} = \infty$ and $\inf_x \{f_X(x) > 0\} = -\infty$.

Suppose that sensor $i$ computes a suitable local detection
statistic $T(X_i)$, to be compared with a certain threshold value.^1
If the threshold is crossed, a local decision in favor of $\mathcal{H}_1$
is made, while the local decision is for $\mathcal{H}_0$ otherwise; let
$D_i \in \{0, 1\}$ be such decision. As proposed in [1], sensor $i$
is programmed to communicate with the network after a time
interval proportional to $1/|T(X_i)|$; in this work, however, it is
supposed that sensor $i$ delivers its local decision $D_i$ instead
of the local statistic $T(X_i)$ as in [1]. We assume that the
sensors are perfectly synchronized so that they share the same
time reference. Then, the larger $|T(X_i)|$ is, the faster
$D_i$ is delivered and, differently from [1], in our scheme the
"winner takes all". Otherwise stated, as soon as the quickest
sensor delivers its own decision (say, the sensor "fires"), such
decision is immediately taken as the final one for the whole
system, all other transmissions by the remaining $n - 1$ sensors
are instantaneously inhibited, and the overall detection process
is terminated.
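The firing mechanism just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the proportionality constant `scale` and the Gaussian test samples are hypothetical and not taken from the paper; only the rule "firing time proportional to 1/|T(X_i)|, first decision wins" comes from the text.

```python
import random

def firing_order(statistics, scale=1.0):
    """Sensor indices sorted by firing time scale/|T(X_i)| (earliest first)."""
    times = [scale / abs(t) if t != 0 else float("inf") for t in statistics]
    return sorted(range(len(statistics)), key=lambda i: times[i])

def one_bit_decision(statistics, threshold=0.0):
    """Global decision = local decision D_i of the first sensor that fires."""
    winner = firing_order(statistics)[0]
    return winner, int(statistics[winner] > threshold)

rng = random.Random(0)
z = [rng.gauss(1.0, 1.0) for _ in range(50)]   # hypothetical transformed samples
winner, decision = one_bit_decision(z)
```

The first sensor to fire is always the one with the largest modulus, so only that sensor's local decision is ever transmitted.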

While many other forms of ordering are certainly conceivable,
the choice of modulus ordering leads to analytical
tractability and has a precise rationale, as detailed later. Two
obvious choices for the transformation $T(\cdot)$ are the identity
$T(x) = x$, and that based on the log-likelihood ratio

\[ T(x) = \mathcal{L}(x) := \log \frac{f_X(x; \mathcal{H}_1)}{f_X(x; \mathcal{H}_0)}. \]

In the former case the firing time of the generic sensor $i$
is proportional to $1/|X_i|$; in the latter it is proportional to
$1/|\mathcal{L}(X_i)|$.

^1 Needless to say, a likelihood ratio test would be the best. However, this
might not be available, e.g., in fully or partially nonparametric setups, so
that it is of interest to study general detection statistics, see also [23].

The idea of accessing the channel by ordering is borrowed
from [1], and the main aim of this paper is to investigate the
asymptotic properties of the above distributed detector, with
respect to the number $n$ of sensors. A number of simplifying
assumptions are made, including the possibility of instantaneously
communicating the first local decision in order to put
the system to sleep, and the assumption of perfect time
synchronism among sensors. While we use this setup to get clean
analytical results and useful insights, some of the effects related
to timing errors and uncertainty are briefly investigated in
Sect. IV-E, exhibiting a certain robustness of the proposed strategy.

B. Detector design 

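First, let us specify the local testing rule for the test (1) at sensor $i$:

\[ T(X_i) \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \gamma_n, \tag{2} \]

where $\gamma_n$ is the detection threshold (which is allowed to
depend on $n$), and the local decision $D_i$ is accordingly defined.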

The transmission policy considered in this work is defined
as follows.

DEFINITION 1 (Transmission policy) The transmission of the
local decision $D_i$ made by the generic sensor $i$ is activated
at a time inversely proportional to the absolute value of its
transformed measurement $|T(X_i)|$; we call this policy MO
(modulus ordered). Within the class of MO policies, if
$T(x) = \mathcal{L}(x)$ the system is called $\ell$-MO (log-likelihood
modulus ordered).

Thus, the transmission policy is identified by the transformation
$T(\cdot)$, leading to the definition of the random variable
$Z_i := T(X_i)$, with cumulative distribution function (cdf)
$F_Z(x; \mathcal{H}_j)$ and pdf $f_Z(x; \mathcal{H}_j)$, under hypothesis $\mathcal{H}_j$ with
$j = 0, 1$. The modulus ordering can be defined in terms of
the index permutation $\pi(\cdot)$ defined by the property that



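\[ |Z_{\pi(1)}| \le |Z_{\pi(2)}| \le \dots \le |Z_{\pi(n)}|, \tag{3} \]

and the decision statistic of our system is $\mathcal{M}_n := Z_{\pi(n)}$.

Therefore, the decision rule of the test (1) for the whole
network is:

\[ \mathcal{M}_n \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \gamma_n. \tag{4} \]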

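Next, consider the following extreme value statistics ($k$ is
a positive integer)

\[ \mathcal{M}_n^+ := \max_{k \le n} Z_k, \qquad \mathcal{M}_n^- := -\min_{k \le n} Z_k = \max_{k \le n} (-Z_k), \]

whence the decision statistic in (4) can be expressed as

\[ \mathcal{M}_n = \begin{cases} \mathcal{M}_n^+ & \text{if } \mathcal{M}_n^+ > \mathcal{M}_n^-, \\ -\mathcal{M}_n^- & \text{if } \mathcal{M}_n^+ < \mathcal{M}_n^-. \end{cases} \tag{5} \]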
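The equivalence between the max-modulus sample of (3) and the two-branch expression (5) can be checked numerically. The sketch below is illustrative only; the function names are hypothetical, and ties (which have probability zero for continuous samples) are not handled.

```python
def decision_statistic(z):
    """Eq. (5): M_n^+ if M_n^+ > M_n^-, otherwise -M_n^-."""
    m_plus = max(z)                   # M_n^+ = max_k Z_k
    m_minus = max(-v for v in z)      # M_n^- = max_k (-Z_k) = -min_k Z_k
    return m_plus if m_plus > m_minus else -m_minus

def max_modulus_sample(z):
    """Sample with the largest absolute value, i.e. Z_{pi(n)} in eq. (3)."""
    return max(z, key=abs)
```

For example, for the samples `[0.3, -2.1, 1.7]` both functions return `-2.1`, the sample that would fire first.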



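Also, let us denote by $F_{\mathcal{M}_n}(x; \mathcal{H}_j)$, $F_{\mathcal{M}_n^+}(x; \mathcal{H}_j)$ and
$F_{\mathcal{M}_n^-}(x; \mathcal{H}_j)$ the cdf's of the above quantities under hypothesis
$\mathcal{H}_j$, and by $f_{\mathcal{M}_n}(x; \mathcal{H}_j)$, $f_{\mathcal{M}_n^+}(x; \mathcal{H}_j)$ and $f_{\mathcal{M}_n^-}(x; \mathcal{H}_j)$
the corresponding pdf's. Standard results of order statistics
theory allow us to compute these functions as follows [24]:

\[ f_{\mathcal{M}_n^+}(x; \mathcal{H}_j) = n F_Z^{n-1}(x; \mathcal{H}_j) f_Z(x; \mathcal{H}_j), \tag{6} \]

\[ f_{\mathcal{M}_n^-}(x; \mathcal{H}_j) = n \left(1 - F_Z(-x; \mathcal{H}_j)\right)^{n-1} f_Z(-x; \mathcal{H}_j). \tag{7} \]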

As regards the statistical distribution of $\mathcal{M}_n$, exploiting
the results provided in [25], [26] we have

\[ f_{\mathcal{M}_n}(x; \mathcal{H}_j) = n h_j^{n-1}(x) f_Z(x; \mathcal{H}_j), \tag{8} \]

where

\[ h_j(x) = F_Z(|x|; \mathcal{H}_j) - F_Z(-|x|; \mathcal{H}_j). \tag{9} \]
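As a sanity check of the density in (8)-(9), the following sketch integrates it numerically for a Gaussian $Z$ and compares the probability of a positive decision statistic with a Monte Carlo estimate. The Gaussian model, the values `n = 10` and `mean = 1.0`, the integration range and all function names are assumptions made for illustration.

```python
import math
import random

def norm_pdf(x, mean, sd=1.0):
    u = (x - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))

def norm_cdf(x, mean, sd=1.0):
    return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2.0)))

def pdf_max_modulus(x, n, mean, sd=1.0):
    """Eq. (8)-(9) for Z ~ N(mean, sd): n * h(x)^(n-1) * f_Z(x)."""
    h = norm_cdf(abs(x), mean, sd) - norm_cdf(-abs(x), mean, sd)
    return n * h ** (n - 1) * norm_pdf(x, mean, sd)

def trapezoid(func, a, b, steps=8000):
    """Composite trapezoidal rule on [a, b]."""
    width = (b - a) / steps
    inner = sum(func(a + k * width) for k in range(1, steps))
    return width * (0.5 * (func(a) + func(b)) + inner)

n, mean = 10, 1.0   # hypothetical values chosen for the check
mass = trapezoid(lambda x: pdf_max_modulus(x, n, mean), -12.0, 12.0)
p_positive = trapezoid(lambda x: pdf_max_modulus(x, n, mean), 0.0, 12.0)

rng = random.Random(7)
trials = 20000
hits = sum(max((rng.gauss(mean, 1.0) for _ in range(n)), key=abs) > 0
           for _ in range(trials))
p_positive_mc = hits / trials
```

Under these assumptions `mass` should be close to one and `p_positive` should agree with `p_positive_mc` up to Monte Carlo error.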



III. Asymptotic analysis 

We are now ready to introduce the considered asymptotic 
setup for order statistics. We are primarily interested in the 
regime of a large number of sensors, that is, $n \to \infty$. However,
in many WSN applications, the number of effective sensors 
that contribute to the final inference is uncertain, due to several 
practical issues, such as failures, time-varying topologies, 
impaired communication, compromised nodes, and so on. 
Accordingly in this paper we consider the more general case of 
a random number of sensors; see, e.g., [27], for a discussion on 
the relevance of this scenario in distributed detection problems. 

Formally, let $N$ be the random number of sensors, whose
distribution depends on an integer parameter^2 $\nu$. Depending on
the application, $\nu$ may represent the total number of available
sensors ($N$ of which are in fact activated), the (integer part of
the) expected number of sensors $\nu = \lfloor \mathrm{E}(N) \rfloor$, and so forth. In
order to define a proper asymptotic setup, the precise sense in
which the random $N$ diverges must be defined. A general and
convenient formalization is to assume that, as the parameter $\nu$
goes to infinity,

\[ \frac{N}{\nu} \to R \quad \text{in probability}, \tag{10} \]

where $R$ is a positive random variable, see [28].



A. Relevant EVT background 

Before illustrating the main asymptotic theorems, we briefly
summarize some relevant facts from the classical literature. Let
$Y_1, Y_2, \dots, Y_n$ be a collection of iid random variables with
unbounded support, and let $M_n = \max_{k \le n} Y_k$.

LEMMA 1 (Attraction) Under mild regularity conditions, there
exist sequences of normalizing constants $a_n$, $b_n$ such that

\[ \lim_{n \to \infty} F_{M_n}(a_n x + b_n) = G(x), \tag{11} \]

where $G(x)$ is either the Gumbel distribution or the Frechet
distribution.
The technical regularity conditions can be found in any textbook
on EVT (e.g., [29]), and the relevant features of the quoted
distributions are as follows [29]:^3

GUMBEL
\[ G(x) = e^{-e^{-x}}, \quad -\infty < x < \infty, \]
\[ b_n = F_Y^{-1}\left(1 - \frac{1}{n}\right), \qquad a_n = \frac{1}{n f_Y(b_n)}, \qquad \lim_{n \to \infty} (b_n / a_n) = \infty. \]

FRECHET
\[ G(x) = \begin{cases} \exp(-x^{-\xi}), & x > 0, \\ 0, & x \le 0, \end{cases} \]
\[ b_n = 0, \qquad a_n = F_Y^{-1}\left(1 - \frac{1}{n}\right), \qquad \lim_{n \to \infty} a_n = \infty. \]

^2 Here we formally consider $\nu$ as an integer parameter, thus the $\nu$-th element
of a generic sequence $\{a_n\}_{n=1}^{\infty}$ can be denoted as $a_\nu$. However, all the
asymptotic results of this work also hold when $\nu$ is real, and $a_\nu$ is replaced
by $a_{\lfloor \nu \rfloor}$.
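A quick numerical illustration of the Gumbel case of Lemma 1 is sketched below, using unit-rate exponential samples as an assumed example (chosen only because the constants are explicit: $b_n = \log n$ and $a_n = 1/(n f_Y(b_n)) = 1$; the exponential parent is not taken from the paper, and only its upper tail matters here).

```python
import math

def gumbel_cdf(x):
    return math.exp(-math.exp(-x))

def max_cdf_exponential(x, n):
    """cdf of (M_n - b_n)/a_n for n iid Exp(1) samples, b_n = log n, a_n = 1."""
    t = x + math.log(n)            # a_n * x + b_n
    if t <= 0.0:
        return 0.0
    return (1.0 - math.exp(-t)) ** n

# Largest gap to the Gumbel cdf over a few test points, for growing n
gaps = [max(abs(max_cdf_exponential(x, n) - gumbel_cdf(x))
            for x in (-1.0, 0.0, 1.0, 2.0))
        for n in (10, 100, 1000, 10000)]
```

The entries of `gaps` shrink roughly like $1/n$, in agreement with the limit (11).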

An extension of Lemma 1 to the case of a random number
of variables has been proved by Galambos [28]:

LEMMA 2 (Attraction with random number of variables) Let
$N$ be an integer random variable and let, as $\nu \to \infty$, $N/\nu$
converge in probability to a positive random variable $R$. If

\[ \lim_{n \to \infty} F_{M_n}(a_n x + b_n) = G(x), \tag{12} \]

then

\[ \lim_{\nu \to \infty} F_{M_N}(a_\nu x + b_\nu) = \mathrm{E}\left(G^R(x)\right), \tag{13} \]

where the expectation is taken under the distribution of $R$.
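The mixture limit in (13) can be illustrated with a toy case in which it holds exactly for every $\nu$: unit-rate exponential samples with $a_\nu = 1$, $b_\nu = \log \nu$, and a network size that is Poisson with mean $\nu r$ given $R = r$. The exponential parent, the two-point distribution of $R$ and the function names are assumptions made for this sketch; an empty network ($N = 0$) is counted as having a maximum below any level.

```python
import math

def poisson_pmf(k, mean):
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def cdf_max_random_size(x, nu, r_values, r_probs):
    """P(M_N <= x + log nu) for Exp(1) samples, N | R = r ~ Poisson(nu * r)."""
    f = 1.0 - math.exp(-(x + math.log(nu)))   # F_Y(a_nu x + b_nu), needs x + log nu > 0
    total = 0.0
    for r, p in zip(r_values, r_probs):
        mean = nu * r
        upper = int(mean + 12.0 * math.sqrt(mean) + 50)   # truncation of the Poisson sum
        total += p * sum(poisson_pmf(k, mean) * f ** k for k in range(upper + 1))
    return total

def mixture_limit(x, r_values, r_probs):
    """Right-hand side of eq. (13), E(G^R(x)), with G the Gumbel cdf."""
    g = math.exp(-math.exp(-x))
    return sum(p * g ** r for r, p in zip(r_values, r_probs))
```

For instance, with `nu = 200` and $R$ equal to 0.5 or 1.5 with equal probability, the two functions agree to numerical precision.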

DEFINITION 2 (Right/Left tail dominance) Given a random
variable $Y$, consider the limit

\[ \lim_{x \to \infty} \frac{1 - F_Y(x)}{F_Y(-x)}. \tag{14} \]

We say that $Y$ is right-tail dominant if the above limit is $+\infty$,
and we say that $Y$ is left-tail dominant if the limit is zero.
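To make the definition concrete, the ratio in (14) can be evaluated for a Gaussian variable, an example assumed here for illustration: a positive mean gives a ratio that grows without bound (right-tail dominance), a negative mean gives a ratio that vanishes (left-tail dominance).

```python
import math

def tail_ratio(x, mean, sd=1.0):
    """(1 - F_Y(x)) / F_Y(-x) for Y ~ N(mean, sd), the ratio in eq. (14)."""
    right = 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))   # P(Y > x)
    left = 0.5 * math.erfc((x + mean) / (sd * math.sqrt(2.0)))    # P(Y < -x)
    return right / left

ratios_right = [tail_ratio(x, 1.0) for x in (1.0, 3.0, 5.0)]    # increasing
ratios_left = [tail_ratio(x, -1.0) for x in (1.0, 3.0, 5.0)]    # decreasing
```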

B. Detection asymptotic properties 

Let us explain the rationale of the modulus ordering and,
for the sake of simplicity, assume that the number
of sensors is deterministic, $N = n$. Suppose that $n$ is "very large".
By (5), the local decision of the firing sensor is based either on
the largest sample $\mathcal{M}_n^+$ or on the smallest sample $-\mathcal{M}_n^-$ of the
$n$ transformed samples collected by the system. However, if the
right tail of the distribution $f_Z(x; \mathcal{H}_1)$ dominates over the left tail
(i.e., the right tail is heavier, or decreases more slowly) then with
high probability the local decision is made using the largest
transformed sample collected by the network. The converse
happens with a left-tail dominant distribution. See Fig. 1 for an
instance of this effect.

In practical problems it is often the case that $f_Z(z; \mathcal{H}_0)$ is
left-tail dominant while $f_Z(z; \mathcal{H}_1)$ is right-tail dominant (or vice
versa), so that the hypothesis test can be thought of as one comparing
a very large positive sample against a very small (i.e., large in
modulus) negative value. Based on this argument, one expects that the
error probability can be made smaller and smaller as $n$ grows. Below,
this intuition is verified and the sense in which the errors can be
controlled is made precise.

^3 To avoid confusion, notice that the assumption of unbounded support
rules out convergence to the third class of attraction, namely the Weibull
distribution.

Fig. 1. Illustrative sketch of the intuition behind this work: Under $\mathcal{H}_1$, the
pdf of $\mathcal{M}_n$ is closer and closer to that of $\mathcal{M}_n^+$, as $n$ grows, and the small
peak located around the peak of the pdf of $-\mathcal{M}_n^-$ tends to vanish. Conversely,
under $\mathcal{H}_0$, the pdf of $\mathcal{M}_n$ approaches that of $-\mathcal{M}_n^-$.



THEOREM 1 (Asymptotics of MO detection statistic) Consider
an MO network with a random number of active sensors $N$.
Suppose that

i) The random variables $Z_1, Z_2, \dots$ have unbounded support.

ii) $\mathcal{M}_n^+$ is attracted under $\mathcal{H}_1$ with normalizing constants
$a_n^+$ and $b_n^+$, and limiting distribution $G^+(x)$. Similarly,
$\mathcal{M}_n^-$ is attracted under $\mathcal{H}_0$ with normalizing constants
$a_n^-$ and $b_n^-$, and limiting distribution $G^-(x)$.

iii) The ratio $N/\nu$ converges to a positive random variable
$R_1$ under $\mathcal{H}_1$, and to a positive random variable $R_0$
under $\mathcal{H}_0$.

iv) The random variables $Z_i$ are right-tail dominant under
$\mathcal{H}_1$ and left-tail dominant under $\mathcal{H}_0$.

Then

\[ \frac{\mathcal{M}_N}{\mathcal{M}_N^+} \to 1, \qquad \mathcal{M}_N - \mathcal{M}_N^+ \to 0, \qquad \text{under } \mathcal{H}_1, \tag{15} \]

and

\[ \frac{\mathcal{M}_N}{\mathcal{M}_N^-} \to -1, \qquad \mathcal{M}_N + \mathcal{M}_N^- \to 0, \qquad \text{under } \mathcal{H}_0, \tag{16} \]

all the convergences being in probability.

Proof: The proof is deferred to Appendix A.

We want to stress that conditions i), ii) and iv) are by no
means restrictive and hold true in a large number of practical
applications. Condition iii) is a convenient way to handle
networks of random size and encompasses as a special case the
scenario of nonrandom $N$. The claim of the theorem, in words,
states that the detection statistic $\mathcal{M}_N$ tends (asymptotically)
to behave like $\mathcal{M}_N^+$ under $\mathcal{H}_1$, and like $-\mathcal{M}_N^-$ under $\mathcal{H}_0$, see
again Fig. 1. As a direct consequence of Theorem 1, we get
the following.

COROLLARY (Attraction of the detection statistic) Under the
same assumptions of Theorem 1

\[ \lim_{\nu \to \infty} F_{\mathcal{M}_N}(a_\nu^+ x + b_\nu^+; \mathcal{H}_1) = H^+(x), \tag{17} \]

and

\[ \lim_{\nu \to \infty} F_{\mathcal{M}_N}(a_\nu^- x - b_\nu^-; \mathcal{H}_0) = H^-(x), \tag{18} \]

where

\[ H^+(x) = \mathrm{E}\left((G^+(x))^{R_1}\right), \tag{19} \]
\[ H^-(x) = 1 - \mathrm{E}\left((G^-(-x))^{R_0}\right). \tag{20} \]

Proof: Assume that $\mathcal{H}_1$ is in force. In view of Theorem 1
it follows that $(\mathcal{M}_N - b_\nu^+)/a_\nu^+$ converges in probability to
$(\mathcal{M}_N^+ - b_\nu^+)/a_\nu^+$. Assumption ii), along with a direct application
of Lemma 2, implies the convergence in distribution
of $(\mathcal{M}_N^+ - b_\nu^+)/a_\nu^+$. Then, the convergence in distribution of
$(\mathcal{M}_N - b_\nu^+)/a_\nu^+$ claimed in (17) follows by a direct application
of Theorem 2.7 in [30]. The proof for $\mathcal{H}_0$ is similar.
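As to the performance of the hypothesis test, this is
expressed in terms of the false alarm and miss detection
probabilities

\[ \alpha_\nu = \mathrm{P}(\text{decide } \mathcal{H}_1; \mathcal{H}_0) = \mathrm{P}(\mathcal{M}_N > \gamma_\nu; \mathcal{H}_0), \tag{21} \]
\[ \beta_\nu = \mathrm{P}(\text{decide } \mathcal{H}_0; \mathcal{H}_1) = \mathrm{P}(\mathcal{M}_N \le \gamma_\nu; \mathcal{H}_1). \tag{22} \]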

We note explicitly that, being N random, the threshold of the 
test cannot be set as a function of that, but rather it must be 
controlled by the parameter v, that is clearly assumed known 
in order to fix the threshold value. 

We consider the classical setup where a prescribed (asymptotic)
false-alarm level $\alpha$ is imposed, while it is required that
$\beta_\nu$ vanishes with increasing $\nu$. In the light of Theorem 1, it is
reasonable to impose an asymptotic false-alarm level $\alpha$ based
on the asymptotic distribution under $\mathcal{H}_0$. We indeed know that
$(\mathcal{M}_N + b_\nu^-)/a_\nu^-$ converges in distribution toward $H^-(x)$. This
implies that the threshold $\gamma_\nu = a_\nu^- \gamma - b_\nu^-$, with

\[ \alpha = 1 - H^-(\gamma) = \mathrm{E}\left((G^-(-\gamma))^{R_0}\right), \]

achieves the asymptotic false alarm $\alpha$.

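For this computation, it is useful to define the false alarm
$\bar{\alpha}$ corresponding to a system with a deterministic number of
sensors, that is

\[ G^-(-\gamma) = \bar{\alpha} \;\Rightarrow\; \gamma = \begin{cases} \log\log(1/\bar{\alpha}) & \text{GUMBEL}, \\ -\left(\log(1/\bar{\alpha})\right)^{-1/\xi} & \text{FRECHET}, \end{cases} \tag{23} \]

where $\xi$ is the exponent of the Frechet distribution, $G(x) = \exp(-x^{-\xi})$
for $x > 0$. The required false alarm $\alpha$ can be computed as a function of
$\bar{\alpha}$, by

\[ \bar{\alpha} : \mathrm{E}\left(\bar{\alpha}^{R_0}\right) = \alpha. \tag{24} \]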
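Relation (24) generally has no closed-form solution, but since $\mathrm{E}(\bar{\alpha}^{R_0})$ is increasing in $\bar{\alpha}$ it can be solved by bisection. The sketch below assumes a discrete distribution for $R_0$ (values `r_values` with probabilities `r_probs`); these names and the discretization are illustrative choices, not part of the paper.

```python
import math

def solve_alpha_bar(alpha, r_values, r_probs, tol=1e-12):
    """Bisection for alpha_bar in (0, 1) with E(alpha_bar^R0) = alpha, eq. (24)."""
    def mixed(a):
        return sum(p * a ** r for r, p in zip(r_values, r_probs))
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixed(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gumbel_gamma(alpha_bar):
    """Gumbel line of eq. (23): gamma = log log(1 / alpha_bar)."""
    return math.log(math.log(1.0 / alpha_bar))

alpha_bar = solve_alpha_bar(0.1, (0.5, 1.5), (0.5, 0.5))
gamma = gumbel_gamma(alpha_bar)
```

When $R_0 = 1$ with probability one the solver simply returns $\bar{\alpha} = \alpha$, which recovers the deterministic-size case.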

The threshold $\gamma_\nu = a_\nu^- \gamma - b_\nu^-$ is selected by using the
asymptotic $\mathcal{H}_0$-distribution $H^-(x)$. An alternative might be
that of imposing the strict equality $\alpha_\nu = \alpha$ for any finite $\nu$.
This, however, would require exact knowledge of the cdf of the
detection statistic for any finite $\nu$, which is usually unavailable.

On the other hand, it is possible to use the asymptotic
"similarity" (under $\mathcal{H}_0$) between $\mathcal{M}_N$ and $-\mathcal{M}_N^-$ to set a new
threshold as $(1 - F_Z(\gamma_\nu; \mathcal{H}_0))^\nu = \bar{\alpha}$, which can be shown
to achieve asymptotically the desired false-alarm level. These
results are summarized in the following theorem.

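THEOREM 2 (MO consistency) Under the assumptions of
Theorem 1:

i) The nonparametric setting $\gamma_\nu = 0$ ensures that

\[ \alpha_\nu + \beta_\nu \to 0. \tag{25} \]

ii) Let the detection threshold be either

\[ \gamma_\nu = a_\nu^- \gamma - b_\nu^-, \tag{26} \]

where $\gamma$ solves $1 - H^-(\gamma) = \alpha$, or

\[ \gamma_\nu = F_Z^{-1}\left(1 - \bar{\alpha}^{1/\nu}; \mathcal{H}_0\right), \tag{27} \]

where $\bar{\alpha}$ solves $\mathrm{E}(\bar{\alpha}^{R_0}) = \alpha$. Then:

\[ \alpha_\nu \to \alpha, \quad \text{and} \quad \beta_\nu \to 0. \tag{28} \]

Proof: The proof is deferred to Appendix B.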


Remark. The threshold setting used in eq. (25) does not require
any a-priori knowledge of the statistics, which amounts to a
nonparametric threshold setting. This may be convenient in
practical applications where limited knowledge of the statistical
model is available to the remote nodes.
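For comparison with the nonparametric setting, the parametric threshold (27) is straightforward to evaluate when the cdf of $Z$ under the null hypothesis is known. The sketch below assumes a Gaussian $Z$ with hypothetical parameters and uses the standard-library `statistics.NormalDist`; the helper name is illustrative.

```python
import math
from statistics import NormalDist

def threshold_27(alpha_bar, nu, z_dist_h0):
    """gamma_nu = F_Z^{-1}(1 - alpha_bar^(1/nu); H0), eq. (27)."""
    p = -math.expm1(math.log(alpha_bar) / nu)   # 1 - alpha_bar^(1/nu), computed stably
    return z_dist_h0.inv_cdf(p)

z_h0 = NormalDist(mu=-1.0, sigma=1.0)   # hypothetical: Z = X ~ N(-theta0, sigma) under H0
gamma_nu = threshold_27(0.01, 1000, z_h0)
# Probability that the smallest of nu samples exceeds the threshold under H0
prob_min_above = (1.0 - z_h0.cdf(gamma_nu)) ** 1000
```

By construction `prob_min_above` equals the target level 0.01, which is the finite-size counterpart of the false-alarm constraint when the decision statistic behaves like the smallest sample.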

The above theorems are valid for a general MO network.
For specific detection problems and/or local transformations,
more powerful results might be obtained. This is the case of
an $\ell$-MO strategy (namely, when $Z_i$ is the log-likelihood of
the observations) applied to shift-in-mean problems of the
kind

\[ f_X(x; \mathcal{H}_0) = \phi(x + \theta_0), \qquad f_X(x; \mathcal{H}_1) = \phi(x - \theta_1), \tag{29} \]

where $\phi(x)$ is an even function, $\phi(x) > 0$ $\forall x$, $\theta_0 > 0$ and
$\theta_1 > 0$. For this scenario we prove the following

THEOREM 3 ($\ell$-MO properties) Assume that the local log-
likelihoods fulfill conditions i), ii) and iii) of Theorem 1.
Then, condition iv) is automatically verified, and the results
of Theorems 1 and 2 apply. In addition, if $N$ is independent
of the observations, the following upper bound on the miss
detection probability holds

\[ \beta_\nu \le e^{\gamma_\nu}. \tag{30} \]

o 

Proof: The proof is deferred to Appendix C.

IV. Applications 

To illustrate the above results, we now focus on sensor
network applications. Both MO and $\ell$-MO systems are investigated
for different case studies, with the twofold goal of
providing a numerical check for the asymptotic convergence
claimed in the theoretical results, and of investigating the effect
of a moderately small number of sensors. We also consider
networks of random size and a typical application example
from the distributed detection domain, namely censoring
sensor systems. Finally, we address the case where the random
network size depends upon the observations, and we briefly
touch upon the robustness of the detection system to timing
offsets.

Fig. 2. MO transmission policy. Error $\alpha_n + \beta_n$ for the Gaussian nonparametric
example, with $\sigma = 1$ and different combinations of $\theta_0$ and $\theta_1$. The
solid curves are obtained by numerical integration based on eq. (8). Dashed
lines refer to the clock offset discussed later, in Sect. IV-E. The effect
of synchronism errors is shown for $\Delta_{clk} = 0.1, 1, 2$ with respect to both the
nominal case $\theta_0 = \theta_1 = 1$ and $\theta_0 = 1.5$, $\theta_1 = 2$.

Fig. 3. $\ell$-MO transmission policy. Error probabilities $\alpha_n$ and $\beta_n$ for the
Gaussian example with known parameters. The curves are obtained by numerical
integration based on eq. (8), and are parametrized in SNR $= (\theta_1 + \theta_0)/\sigma$.
Panels (a) and (b) refer to the threshold setting given in (26), while (c) and
(d) refer to that in (27). In the lower plots are also shown, as dashed lines,
the miss detection bounds (30).

A. Gaussian observations 

Let us start by considering the following Gaussian observation
model ($\mathcal{N}(a, b)$ is our shorthand for a Gaussian distribution
with mean $a$ and standard deviation $b$):

\[ \mathcal{H}_0 : X_i \sim \mathcal{N}(-\theta_0, \sigma), \quad \text{vs.} \quad \mathcal{H}_1 : X_i \sim \mathcal{N}(\theta_1, \sigma), \tag{31} \]

where $\theta_0$, $\theta_1$ and $\sigma$ are positive parameters, and the number of
sensors $n$ is deterministic (namely, here we set $N = \nu = n$).
Let us consider first the MO policy with $T(x) = x$. According
to Theorem 2, $\alpha_n \to 0$ and $\beta_n \to 0$. This is true even if $\theta_0$, $\theta_1$,
$\sigma$ and $n$ are all unknown. In this case, we are faced with a fully
nonparametric test in which the sensors have no knowledge of
the parameters appearing in (31) and they accordingly use a
zero threshold, see Theorem 2, part i). The results are shown
in Fig. 2, where the corresponding error probabilities (solid
curves) have been obtained by numerical integration based on
expression (8): as predicted, both the error probabilities go to
zero, with a rate that depends upon the system parameters.
The dashed curves refer to the effect of clock offsets, and we
comment on this later.
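The numerical integration mentioned above can be sketched as follows for the zero-threshold MO test with $T(x) = x$. This is an illustrative reimplementation, not the authors' code: the trapezoidal rule, the integration span (adequate only for parameters of order one) and the function names are assumptions.

```python
import math

def norm_pdf(x, mean, sd):
    u = (x - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))

def norm_cdf(x, mean, sd):
    return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2.0)))

def pdf_statistic(x, n, mean, sd):
    """Density (8) of the max-modulus sample for Z ~ N(mean, sd)."""
    h = norm_cdf(abs(x), mean, sd) - norm_cdf(-abs(x), mean, sd)
    return n * h ** (n - 1) * norm_pdf(x, mean, sd)

def trapezoid(func, a, b, steps=4000):
    width = (b - a) / steps
    inner = sum(func(a + k * width) for k in range(1, steps))
    return width * (0.5 * (func(a) + func(b)) + inner)

def total_error(n, theta0=1.0, theta1=1.0, sigma=1.0, span=15.0):
    """alpha_n + beta_n for the zero-threshold MO test under model (31)."""
    alpha = trapezoid(lambda x: pdf_statistic(x, n, -theta0, sigma), 0.0, span)
    beta = trapezoid(lambda x: pdf_statistic(x, n, theta1, sigma), -span, 0.0)
    return alpha + beta

errors = {n: total_error(n) for n in (5, 50, 500)}
```

With these assumed parameters the total error in `errors` decreases as the number of sensors grows, in line with part i) of Theorem 2.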

With reference to the same observation model (31), suppose
now that the parameters are known and that the $\ell$-MO
policy is adopted. It is clear that the $\ell$-MO policy cannot
be implemented without knowledge of the distribution
parameters, since it requires the computation of the likelihood.
We assume again that $n$ is deterministic. This case lies in the
application domain of Theorem 3 and the error probabilities,
still computed by numerical integration based on (8), are
illustrated in Fig. 3. It is worth noting that the asymptotic value
is approached faster for larger values of SNR $= (\theta_1 + \theta_0)/\sigma$.
This should be expected because, as can be easily seen, the
tail dominance is "stronger" when the SNR grows. Panels (a)
and (b) refer to the threshold setting given in (26), while (c)
and (d) refer to the threshold in (27). We see that $\alpha_n$ converges
to the desired asymptotic value (set to $\alpha = 10^{-2}$ in the figure);
however, in (c) the convergence is somewhat faster than that
in (a), suggesting that the threshold setting (27) provides, in
this example, some advantage. As claimed in Theorem 3, we
see that $\beta_n \to 0$, as shown in panels (b) and (d). Note that the
curves in (b) and (d) are very similar, which reveals that the
selection of the threshold between the two alternatives is not
critical with respect to $\beta_n$. Also shown is the upper bound on the
miss detection probability given by (30), which in this case (and
for both the thresholds) can be approximated by the simple
expression $\beta_n \le \exp(-\mathrm{SNR}\,\sqrt{2 \log n})$, after neglecting terms
of higher order in $n$.
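For the Gaussian model (31) the log-likelihood ratio is linear in the observation, which follows directly from the ratio of the two Gaussian densities. The sketch below writes it out together with the approximate miss-detection bound quoted in the text; the function names are illustrative.

```python
import math

def llr_gaussian(x, theta0, theta1, sigma):
    """log f_X(x; H1) / f_X(x; H0) for model (31); linear in x."""
    return (theta0 + theta1) * (2.0 * x + theta0 - theta1) / (2.0 * sigma ** 2)

def miss_bound_approx(n, snr):
    """Approximate bound quoted in the text: exp(-SNR * sqrt(2 log n))."""
    return math.exp(-snr * math.sqrt(2.0 * math.log(n)))
```

The approximate bound decays slowly with `n` (through the square root of a logarithm) but more quickly for larger SNR, consistent with the trends described for Fig. 3.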

B. Networks of random size 

The powerfulness of the theorems presented in the previous 
section allows us to consider the more general setting of 
network of random size 7Y. Consider hence the following 
scenario. 

• Sensors are randomly deployed in a two-dimensional re- 
gion A, according to a non-homogeneous Poisson field. The 
intensity function of this field is A(a;), x G A, such that the 
average number of sensors in the region A is J xeA X(x)dx. 

• Some sensors are impaired before (or at) the act of commu- 
nication. Thus, the number of active sensors iV is a subset 
of those globally available. The probability of a failure is 
unknown to the network, and is accordingly modeled as a 
random variable Q, independent of the deploying process. 

• Conditioned on Q = q, the active sensors are selected 
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Fig. 4. £-MO transmission policy with networks of random size and Gaussian 
observation model. The values of a n and 0„ are plotted, for different values 
of SNR= (61 + 80) /a and different values of A. The curves labeled with 
"theoretical" are shown for comparison and refers to N deterministic, while 
those with A = refer to the simple Poisson model with deterministic mean 
value. 



independently and with probability q from the total number 
of sensors available in A. Given Q — q, the number of active 
sensors becomes a Poisson random variable with mean value 

1 IxeA X ( x ) dx - 

• Accordingly, the average number of active sensors is 



E(N) = E(Q) 



X(x)dx :- 



xeA 



By introducing the normalized random variable R = Q/K(Q), 
we have 

F(N = n\R = r) = ^J—e~ vr . 

• We focus on the asymptotic regime of an increasingly large
number of sensors, which in this context is formalized by $\nu \to \infty$. From
a practical perspective, note that an increasingly large value
of $\nu$ may be due to an increasingly large sensor density $\lambda(x)$,
which corresponds to the asymptotic regime of a dense network
(recall however that sensors' observations are iid), or to an
increasingly large surveyed region $A$, corresponding to the
asymptotic regime of a large network.

• It is easy to show that$^4$ $N/\nu \to R$ in probability, when
$\nu \to \infty$.
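The random-size model described by the bullets above can be simulated as follows. This is a minimal Python sketch under stated assumptions: the variable $Q$ is taken as $Q = E(Q) + U$ with $U$ uniform, and the helper names `sample_poisson` and `sample_network_size` are hypothetical.

```python
import random


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Poisson variate obtained by counting unit-rate exponential arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_network_size(nu: float, mean_q: float, delta: float, rng: random.Random):
    """Draw (N, R) with Q = E(Q) + U, U ~ Uniform(-delta/2, delta/2), R = Q / E(Q),
    and N | R = r ~ Poisson(nu * r)."""
    if not (delta / 2.0 <= mean_q <= 1.0 - delta / 2.0):
        raise ValueError("Q must stay within [0, 1]")
    q = mean_q + rng.uniform(-delta / 2.0, delta / 2.0)
    r = q / mean_q
    return sample_poisson(nu * r, rng), r


rng = random.Random(0)
for nu in (100, 1000, 10000):
    n, r = sample_network_size(nu, mean_q=0.5, delta=0.4, rng=rng)
    print(nu, n / nu, r)  # N / nu is expected to get closer to R as nu grows
```

Counting exponential arrivals keeps the sketch free of external dependencies; its cost grows linearly with the Poisson mean, which is acceptable for illustration purposes.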

As an example, let us consider again the observation model
in (31) with $\mathcal{L}$-MO transmission policy, but assume now that
the effective network size $N$ is random according to the model
described above, and suppose that $Q = E(Q) + U$, with
$U \sim \mathcal{U}(-\Delta/2, \Delta/2)$, where $\mathcal{U}(a, b)$ stands for the uniform
distribution with support $(a, b)$. In Fig. 4 the false alarm and miss
detection probabilities, parametrized in SNR $= (\theta_1 + \theta_0)/\sigma$, are
shown for different values of $\Delta$, with $E(Q) = 0.5$. The curves
are obtained by means of Monte Carlo computer experiments,
except those labeled as "theoretical". These, plotted for comparison,
refer to the case of $N = \nu$ deterministic and are
obtained by numerical integration.

$^4$ In fact, $P(|N/\nu - r| > \epsilon \mid R = r) \to 0$ by the weak law of large numbers.
Convergence in probability of the unconditioned random variable $N/\nu$ to $R$
easily follows by the Lebesgue dominated convergence theorem.

Fig. 5. Censoring transmission policy with networks of random size and
non-Gaussian observation model. The values of $\alpha_n$ and $\beta_n$ are plotted, for
two values of a and different values of the randomness index $\Delta$. The curves
with $\Delta = 0$ refer to the simple Poisson model with deterministic mean value.

We see that the false alarm probability $\alpha_n$ converges to
its limiting value $\alpha = 10^{-1}$, with a convergence rate that is
faster for larger SNRs. The same is true for the convergence
to zero of $\beta_n$ shown in the lower plot. It is also worth noting
that the limiting value of $\alpha_n$ is approached at almost the same
rate for random and deterministic $N$. When the randomness grows, namely $\Delta$
becomes larger, we see that for the miss detection probability
the convergence is slightly slowed down.
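A minimal Monte Carlo sketch of this kind of experiment is given below, in Python. It is not the code used for Fig. 4: it assumes a Gaussian shift-in-mean model with $X_i \sim \mathcal{N}(\theta_1, \sigma^2)$ under $\mathcal{H}_1$ and $X_i \sim \mathcal{N}(-\theta_0, \sigma^2)$ under $\mathcal{H}_0$, a deterministic number of sensors, and a zero threshold rather than the thresholds used in the figure. All function names are hypothetical.

```python
import random


def llr(x: float, theta0: float, theta1: float, sigma: float) -> float:
    """Log-likelihood ratio of N(theta1, sigma^2) against N(-theta0, sigma^2) at x."""
    return (theta0 + theta1) * (2.0 * x + theta0 - theta1) / (2.0 * sigma ** 2)


def one_bit_statistic(samples, theta0, theta1, sigma):
    """Log-likelihood with the largest modulus, i.e. the single delivered value."""
    return max((llr(x, theta0, theta1, sigma) for x in samples), key=abs)


def miss_rate(n, theta0, theta1, sigma, trials, rng, threshold=0.0):
    """Monte Carlo estimate of P(statistic <= threshold) under H1 with n sensors."""
    misses = 0
    for _ in range(trials):
        samples = [rng.gauss(theta1, sigma) for _ in range(n)]
        if one_bit_statistic(samples, theta0, theta1, sigma) <= threshold:
            misses += 1
    return misses / trials


rng = random.Random(0)
for n in (5, 50):
    print(n, miss_rate(n, 1.0, 1.0, 1.0, trials=1000, rng=rng))
```

A random network size could be accommodated by drawing `n` afresh in every trial before generating the samples.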

C. An example with censoring 

Consider again a network of random size, as described
in Sect. IV-B, but let us explore an example in which the
transmission policy is based on a censoring strategy. Censoring
techniques are commonly implemented in WSNs working
under severe communication constraints, and amount to
discarding sensors' observations considered poorly informative for
the detection purpose, see, e.g., [ 1 1 — [ 1 3 1 , (3T|. This can be 
obtained by selecting the transformation for the transmission 
policy according to the censoring rule: 



$$T(x) = \begin{cases} x, & \text{if } |x| \ge \theta_c, \\ 0, & \text{if } |x| < \theta_c, \end{cases}$$

where $\theta_c > 0$ is the censoring threshold.
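As an illustration, a censoring rule of this type can be sketched in Python as follows. The sketch assumes that a sample is kept unchanged when its modulus is at least $\theta_c$ and is mapped to zero (so that the sensor never transmits) otherwise; the function names are hypothetical.

```python
def censor(x: float, theta_c: float) -> float:
    """Keep x when |x| >= theta_c, otherwise map it to 0 (censored sample)."""
    return x if abs(x) >= theta_c else 0.0


def first_delivered(samples, theta_c: float):
    """Uncensored sample with the largest modulus, or None if every sample is censored."""
    kept = [censor(x, theta_c) for x in samples]
    best = max(kept, key=abs, default=0.0)
    return best if best != 0.0 else None


print(first_delivered([0.2, -3.0, 2.5], theta_c=1.0))
print(first_delivered([0.2, -0.3], theta_c=1.0))
```

Returning `None` makes the case in which no sensor transmits explicit, so that the fusion center can handle it separately.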

To enrich the example, we adopt an observation model
different from the Gaussian one considered so far. Specifically,
assume that under $\mathcal{H}_1$ the sensors observe $X_i = W_i$ while
under $\mathcal{H}_0$ they observe $X_i = -W_i$. The random variables
$W_i$ are iid with pdf given by a mixture of a Gaussian
and a Pareto density:



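[Equation not recoverable from the source: a mixture, with weight $p$, of a Gaussian density and a Pareto density with tail exponent $-b-1$, the latter multiplied by a unit step $u(\cdot)$.]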



where $u(x)$ is the unit step function and $0 < p < 1$. In the
computer experiments we set C = = b = 1, p = 0.5, 
and $E(Q) = 0.7$. The detection errors $\alpha_n$ and $\beta_n$, computed
by Monte Carlo simulations, are displayed in Fig. 5 for two
values of a and different values of $\Delta$. The general behavior is
similar to that of Fig. 4; in particular, the convergence is faster
when there is less randomness in the system, as quantified by
the value of $\Delta$.

D. Observation-dependent network size 

The previous examples demonstrate the versatility of
the theorems provided in Sect. III, which ensure the asymptotic
convergence of detection tests under a very broad class of
scenarios of practical relevance, including different
transmission policies, different observation distributions, and
very general network models. We now go even further by
letting the random network size $N$ depend on the
sensors' observations, a possibility well encompassed in the
theorems of Sect. III.

Let, as usual, the $i$th sensor of the network monitor the
physical phenomenon of interest by collecting the sample $X_i$.
Suppose further that $X_i = S_i + W_i$, where the random variable
$S_i$ models the intrinsic state of the observed phenomenon,
while the random variable $W_i$ models the sensor measurement
process; these two components are mutually independent and
independent across sensors.

As a distinct feature of this new scenario, we assume that
the number of samples collected by the system is dependent
upon the nature of the observed phenomenon, in such a way
that the monitoring stage is ended at a certain random sample
number$^5$:

$$N := \inf\{n : \varphi(S_1, S_2, \dots, S_n) > \nu\}, \qquad \varphi(\cdot) > 0. \qquad (32)$$

To fix ideas, $\varphi(\cdot)$ might be thought of as a measure of the energy
emitted by the surveyed physical system, which is assumed to
be limited. In our asymptotic framework, we are interested
in increasingly large values of $N$, and this explains why the
threshold in (32) has been just set to $\nu$.

Consider for example a Gaussian shift-in-mean problem,
with $S_i \sim \mathcal{N}(-\theta_0, \sigma_s^2)$ under $\mathcal{H}_0$ and $S_i \sim \mathcal{N}(\theta_1, \sigma_s^2)$ under
$\mathcal{H}_1$, and with $W_i \sim \mathcal{N}(0, \sigma_w^2)$ under both hypotheses. Assume
also that the stopping rule for the acquisition process is
$N := \inf\{n : \sum_{i=1}^{n} S_i^2 > \nu\}$, where $\varphi(s_1, s_2, \dots, s_n) = \sum_{i=1}^{n} s_i^2$
quantifies an energy expense (but for a normalization factor).
By the theory of renewal processes [32], we know, for $j = 0, 1$,

$$\frac{N}{\nu} \to \frac{1}{\theta_j^2 + \sigma_s^2} \quad \text{under } \mathcal{H}_j, \qquad (33)$$

where the limit is to be intended with probability one. Note
further that the parameter $\nu$ is still related to the expected
number of sensors. Indeed we also have [32], for $j = 0, 1$,

$$\frac{E(N)}{\nu} \to \frac{1}{\theta_j^2 + \sigma_s^2} \quad \text{under } \mathcal{H}_j. \qquad (34)$$
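The renewal-type behavior of the stopping number can be checked numerically. The Python sketch below assumes $S_i \sim \mathcal{N}(\theta, \sigma_s^2)$ with $\sigma_s^2$ denoting a variance, so that $E(S_i^2) = \theta^2 + \sigma_s^2$, and uses the hypothetical helper `stopping_number`; it is an illustration, not the code behind Fig. 6.

```python
import random


def stopping_number(nu: float, theta: float, sigma_s: float, rng: random.Random) -> int:
    """Smallest n such that S_1^2 + ... + S_n^2 > nu, with S_i ~ N(theta, sigma_s^2)."""
    energy = 0.0
    n = 0
    while energy <= nu:
        s = rng.gauss(theta, sigma_s)
        energy += s * s
        n += 1
    return n


rng = random.Random(0)
for nu in (100, 10000):
    # The ratio N / nu is expected to approach 1 / (theta^2 + sigma_s^2) = 0.5 for large nu.
    print(nu, stopping_number(nu, 1.0, 1.0, rng) / nu)
```

Since only $S_i^2$ enters the stopping rule, the same sketch covers both hypotheses whenever $\theta_0 = \theta_1$.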



$^5$ The random "time" defined in eq. (32) is by construction a Markov time, in
that the event $N = n$ is determined by the observations of the first $n$ samples.
Moreover, it will be a stopping time, provided that $P(N < \infty) = 1$.

Fig. 6. $\mathcal{L}$-MO transmission policy for a network whose size depends upon the
observations. Top plot (valid for both $\mathcal{H}_0$ and $\mathcal{H}_1$) shows several realizations
of $N/\nu$ (thin curves) to illustrate the convergence with probability one in (33),
along with $E(N)/\nu$ that converges to the same limit according to (34). Lower
plots show the error probabilities for an asymptotic false alarm $\alpha = 10^{-2}$,
with threshold set by (27).



We apply the proposed one-bit detection strategy to the
above situation with $\theta_0 = \theta_1$, $\sigma_s = \sigma_w = 1$ and SNR $=
(\theta_1 + \theta_0)/\sqrt{\sigma_s^2 + \sigma_w^2}$. The pertinent results are displayed in
Fig. 6 for an asymptotic false alarm probability $\alpha = 10^{-2}$,
with threshold set by eq. (27) and using an $\mathcal{L}$-MO policy. In
Fig. 6(a) we show the convergence of the stopping number
$N$ as given in (33) and (34). In Fig. 6(b) and (c), we show
the behavior of the error probabilities.

E. Resilience to clock offset 

All the cases addressed above fall within the assumptions of our
theorems. As a last example, we briefly investigate the
robustness of the results with respect to models that slightly
deviate from the formal assumptions of the theorems presented
in Sect. III. In particular, we have assumed so far that sensors
are perfectly synchronized, and one should note that the
transmission policy strictly relies on this assumption. What if
the time references of the sensors are slightly misaligned? To make
things simple, suppose that the clock of the generic sensor $i$ is
perturbed by $U_i \sim \mathcal{U}(-\Delta_{clk}/2, \Delta_{clk}/2)$, which models the timing
offset. In other words, sensor $i$ will attempt to transmit its
local decision at the time instant $1/|T(X_i)| + U_i$. Let us refer,
for simplicity, to the Gaussian shift-in-mean example provided
in Sect. IV-A. The dashed curves in Fig. 2 show the effect of
timing errors with $\Delta_{clk} = 0.1, 1, 2$, with respect to
the nominal cases $\theta_0 = \theta_1 = 1$ and $\theta_0 = 1.5$, $\theta_1 = 2$.
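The effect of the clock offsets on the firing order can be sketched in Python as follows. The helpers `firing_index` and `mismatch_rate` are hypothetical; the sketch assumes that sensor $i$ fires at $1/|T(X_i)| + U_i$, with $U_i$ uniform on $(-\Delta_{clk}/2, \Delta_{clk}/2)$, and, for simplicity, that $T(X_i) = X_i$ with Gaussian samples.

```python
import random


def firing_index(stats, delta_clk: float, rng: random.Random) -> int:
    """Index of the sensor that transmits first when sensor i fires at 1/|T_i| + U_i."""
    times = []
    for t in stats:
        offset = rng.uniform(-delta_clk / 2.0, delta_clk / 2.0)
        nominal = 1.0 / abs(t) if t != 0.0 else float("inf")
        times.append(nominal + offset)
    return min(range(len(stats)), key=times.__getitem__)


def mismatch_rate(n, theta, sigma, delta_clk, trials, rng):
    """Fraction of trials in which the firing sensor is not the one with the largest modulus."""
    mismatches = 0
    for _ in range(trials):
        stats = [rng.gauss(theta, sigma) for _ in range(n)]
        largest = max(range(n), key=lambda i: abs(stats[i]))
        if firing_index(stats, delta_clk, rng) != largest:
            mismatches += 1
    return mismatches / trials


rng = random.Random(0)
for delta_clk in (0.0, 0.1, 1.0, 2.0):
    print(delta_clk, mismatch_rate(50, 1.0, 1.0, delta_clk, trials=500, rng=rng))
```

The mismatch rate only counts how often the wrong sensor fires first; it does not by itself quantify the resulting loss in detection performance.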

As can be seen, the test consistency seems to be preserved,
thus evidencing a certain robustness of the proposed strategy.
On the other hand, and perhaps unsurprisingly, by increasing
the offset error $\Delta_{clk}$, the performance worsens in the sense
that the rate of convergence is slower, as a consequence of the
fact that the firing sensor may be different from the one with the
largest modulus, which conveys the largest information. Quantifying
the effect of $\Delta_{clk}$ on the convergence rate, and understanding
whether a certain $\Delta_{clk}$ exists such that asymptotic convergence
of the errors is lost, remain open problems.

V. Summary 

Distributed detection in large wireless sensor networks can
be performed by the transmission of a single bit, exploiting
the idea of ordered transmission policies. After casting the
problem in a precise mathematical framework, we propose an
easy-to-implement distributed statistical test whose asymptotic
consistency is formally proved: both error probabilities
can be controlled in the asymptotic regime of large network
size, under a very broad class of observation models (from the
classical Gaussian shift-in-mean to considerably more general
measurement settings) and application domains, including
nonparametric tests, likelihood-based transmission policies,
censored systems, random network size, and
observation-dependent sensor number.

Appendix A 
Proof of Theorem 1 

We shall work under $\mathcal{H}_1$, thus proving eqs. (15), and consistently
skip the explicit dependence upon the hypothesis for notational
ease. The proof of eqs. (16) follows straightforwardly.
Let us introduce the sequence of events $\mathcal{E}_\nu = \{\mathcal{M}_N^+ > \mathcal{M}_N^-\}$.
In view of the definition of the detection statistic $\mathcal{M}_N$, the
claim of the theorem will be certainly true if $P(\mathcal{E}_\nu) \to 1$. In
order to show that this convergence actually takes place, let
us elaborate as follows. Let $0 < x_0 < x_1 < +\infty$ be such that

$$P(R \in [x_0, x_1]) \ge 1 - \epsilon, \qquad (35)$$



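where, we recall, $N/\nu$ converges in probability to $R$. We can
now legitimately write

$$P(\mathcal{E}_\nu) \ge P\left(\mathcal{M}_N^+ > \mathcal{M}_N^-,\ \frac{N}{\nu} \in [x_0, x_1]\right) \ge P\left(\mathcal{M}_{n_0}^+ > \mathcal{M}_{n_1}^-,\ \frac{N}{\nu} \in [x_0, x_1]\right), \qquad (36)$$

where we have defined $n_j = \lfloor \nu x_j \rfloor$, $j = 0, 1$, and the
last inequality follows by obvious properties of maxima and
minima. Assume for now that

$$\lim_{\nu \to \infty} P\left(\mathcal{M}_{n_0}^+ > \mathcal{M}_{n_1}^-\right) = 1. \qquad (37)$$

This would imply that the limit of the last term in eq. (36) equals

$$\lim_{\nu \to \infty} P\left(\frac{N}{\nu} \in [x_0, x_1]\right) = P(R \in [x_0, x_1]) \ge 1 - \epsilon,$$

where the inequality follows by eq. (35). Inequality (36), $\epsilon$
being arbitrary, implies $\liminf_{\nu \to \infty} P(\mathcal{E}_\nu) = 1$, and hence
$P(\mathcal{E}_\nu) \to 1$ as $\nu$ diverges.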

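It remains thus to show that eq. (37) holds. To this aim, it is
expedient to work in terms of the normalized variables
$\widetilde{\mathcal{M}}_{n_0}^+ = (\mathcal{M}_{n_0}^+ - b^+)/a^+$ and $\widetilde{\mathcal{M}}_{n_1}^- = (\mathcal{M}_{n_1}^- - b^+)/a^+$. Note first
that, by assumption iii), it is straightforward to conclude that

$$\lim_{\nu \to \infty} P\left(\widetilde{\mathcal{M}}_{n_0}^+ \le x\right) = \left(G^+(x)\right)^{\eta}, \quad \text{where } \eta = \ldots \qquad (38)$$

Furthermore, by assumption iv) we know that the $Z_i$'s are
right-tail dominant under $\mathcal{H}_1$, which implies (see [33], proof
of Theorem 2.1), for all $x$ with $G^+(x) \ne 0$ and $G^+(x) \ne 1$,
that

$$P\left(\widetilde{\mathcal{M}}_{n_1}^- \le x\right) \to 1. \qquad (39)$$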

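It is convenient to study separately the different admissible
attraction domains. Let us first consider the case that $G^+(x)$
is a Gumbel distribution. We have

$$P\left(\mathcal{M}_{n_0}^+ > \mathcal{M}_{n_1}^-\right) \ge P\left(\widetilde{\mathcal{M}}_{n_0}^+ - \widetilde{\mathcal{M}}_{n_1}^- > 0,\ \widetilde{\mathcal{M}}_{n_1}^- < -\delta\right) \ge P\left(\widetilde{\mathcal{M}}_{n_0}^+ + \delta > 0,\ \widetilde{\mathcal{M}}_{n_1}^- < -\delta\right), \qquad (40)$$

where $\delta > 0$ is arbitrarily large. On the other hand, eq. (39),
along with eq. (38), implies

$$\lim_{\nu \to \infty} P\left(\widetilde{\mathcal{M}}_{n_0}^+ + \delta > 0,\ \widetilde{\mathcal{M}}_{n_1}^- < -\delta\right) = 1 - e^{-\eta e^{\delta}}, \qquad (41)$$

yielding, in the light of eq. (40) and $\delta$ being arbitrary,

$$\liminf_{\nu \to \infty} P\left(\mathcal{M}_{n_0}^+ > \mathcal{M}_{n_1}^-\right) = 1.$$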

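Let us switch now to the case that $G^+(x)$ is Fréchet
distributed. We first note that $\widetilde{\mathcal{M}}_{n_1}^-$ now vanishes in probability.
Indeed, thanks to eq. (39), we have $P(\widetilde{\mathcal{M}}_{n_1}^- > \epsilon) \to 0$ and

$$P\left(\widetilde{\mathcal{M}}_{n_1}^- < -\epsilon\right) \le P\left(Z_i > 0,\ i = 1, \dots, n_1\right) = \left(1 - F_Z(0)\right)^{n_1} \to 0,$$

having used the fact that, for the Fréchet domain of attraction,
$b^+ = 0$. Moreover, by eq. (38), the sequence $\widetilde{\mathcal{M}}_{n_0}^+$ converges
to a Fréchet random variable. Slutsky's theorem allows us to
conclude that the sequence $\widetilde{\mathcal{M}}_{n_0}^+ - \widetilde{\mathcal{M}}_{n_1}^-$ converges in
distribution to a non-negative random variable, implying the desired
result (37). ∎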

Appendix B 
Proof of Theorem 2 

Let us start with part i), and accordingly consider the term

$$\beta_\nu = P(\mathcal{M}_N < 0; \mathcal{H}_1). \qquad (42)$$

Now, for the case that $G^+(x)$ is Gumbel, by the convergence in
distribution of $(\mathcal{M}_N - b_\nu^+)/a_\nu^+$, and the divergence of the term
$b_\nu^+/a_\nu^+$, we deduce that $\mathcal{M}_N/b_\nu^+ \to 1$ in probability, implying
$\beta_\nu \to 0$ in the light of eq. (42). For the case that $G^+(x)$ is
Fréchet, we know that $\mathcal{M}_N/a_\nu^+$ converges in distribution to
$H^+(x)$, which is supported on $x > 0$, and again $\beta_\nu \to 0$.
Similar reasoning will lead to $\alpha_\nu \to 0$.

Let us switch to part ii), and consider first $\gamma_\nu$ as in
eq. (26). For this case, convergence of $\alpha_\nu$ toward $\alpha$ is nothing
but eq. (18). Let us move to $\gamma_\nu$ defined as in eq. (27). The
attraction properties of $\mathcal{M}_\nu^-$ imply

$$P\left(\frac{\mathcal{M}_\nu^- - b_\nu^-}{a_\nu^-} \le -\gamma\right) \to G^-(-\gamma) = \alpha,$$

where the last equality follows by eq. (23). On the other hand,
by the definition of the refined threshold $\gamma_\nu$ in eq. (27), and
in view of Lemma 11.2.1 in [34], we conclude that
$(\gamma_\nu + b_\nu^-)/a_\nu^- \to \gamma$. By eq. (18) we have thus

$$\alpha_\nu = P(\mathcal{M}_N > \gamma_\nu; \mathcal{H}_0) = P\left(\frac{\mathcal{M}_N + b_\nu^-}{a_\nu^-} > \frac{\gamma_\nu + b_\nu^-}{a_\nu^-}; \mathcal{H}_0\right) \to \alpha,$$

the limiting value following by eq. (24).

Let us now switch to the analysis of $\beta_\nu$. Note that the
threshold $\gamma_\nu$ in eq. (26) is negative, at least for sufficiently
large $\nu$. Indeed, if $G^-(x)$ is Gumbel, $b_\nu^-/a_\nu^- \to \infty$ and
$a_\nu^- > 0$; if $G^-(x)$ is Fréchet, $\gamma < 0$, the support of
$1 - H^-(x)$ being confined to the negative axis, see eq. (20). Simple
inspection shows that the threshold $\gamma_\nu$ in eq. (27) is negative as
well, at least for sufficiently large $\nu$. Thus, for sufficiently
large $\nu$ and for both choices of the thresholds one can write
$\beta_\nu \le P(\mathcal{M}_N < 0; \mathcal{H}_1)$, and the proof is now complete. ∎

Appendix C 
Proof of Theorem 3 

Let us first check the validity of condition iv) in Theorem 1.
The symmetry of the function $\phi(x)$ implies, for the
considered shift-in-mean problem, $f_Z(x; \mathcal{H}_1) = f_Z(-x; \mathcal{H}_0)$.
The well-known nesting rule for the log-likelihoods [35]
further gives $\log \frac{f_Z(x; \mathcal{H}_1)}{f_Z(x; \mathcal{H}_0)} = x$. Combining the above
results gives $f_Z(x; \mathcal{H}_0) = f_Z(-x; \mathcal{H}_0) e^{-x}$ and $f_Z(x; \mathcal{H}_1) =
f_Z(-x; \mathcal{H}_1) e^{x}$, which clearly implies that $Z$ is right-tail
dominant under $\mathcal{H}_1$ and left-tail dominant under $\mathcal{H}_0$.
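For the Gaussian shift-in-mean case the symmetry and nesting identities above can be verified numerically. The Python sketch below assumes $X \sim \mathcal{N}(\theta_1, \sigma^2)$ under $\mathcal{H}_1$ and $X \sim \mathcal{N}(-\theta_0, \sigma^2)$ under $\mathcal{H}_0$, for which the log-likelihood $Z$ is Gaussian with mean $\pm d^2/2$ and variance $d^2$, where $d = (\theta_0 + \theta_1)/\sigma$; the function names are hypothetical.

```python
import math


def normal_pdf(x: float, mean: float, var: float) -> float:
    """Density of a Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def llr_pdfs(x: float, d: float):
    """Return (f_Z(x; H1), f_Z(x; H0)) for the Gaussian shift-in-mean log-likelihood."""
    return normal_pdf(x, d * d / 2.0, d * d), normal_pdf(x, -d * d / 2.0, d * d)


d = 1.5
for x in (-2.0, 0.3, 1.7):
    f1, f0 = llr_pdfs(x, d)
    # Nesting rule: log(f1 / f0) should equal x.
    # Symmetry: f_Z(x; H1) should equal f_Z(-x; H0), so the last printed value should be 0.
    print(x, math.log(f1 / f0), f1 - llr_pdfs(-x, d)[1])
```

The check covers only this Gaussian instance; the argument in the text relies on the symmetry of $\phi(x)$ and not on Gaussianity.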

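Let us now prove eq. (30). To this aim, we write the
log-likelihood ratio of $\mathcal{M}_N$:

$$\log \frac{\sum_{n=1}^{\infty} f_{\mathcal{M}_n}(x; \mathcal{H}_1) P(N = n)}{\sum_{n=1}^{\infty} f_{\mathcal{M}_n}(x; \mathcal{H}_0) P(N = n)} = \log \frac{f_Z(x; \mathcal{H}_1) \sum_{n=1}^{\infty} n\, h_1^{n-1}(x) P(N = n)}{f_Z(x; \mathcal{H}_0) \sum_{n=1}^{\infty} n\, h_0^{n-1}(x) P(N = n)} = x + \log \frac{\sum_{n=1}^{\infty} n\, h_1^{n-1}(x) P(N = n)}{\sum_{n=1}^{\infty} n\, h_0^{n-1}(x) P(N = n)},$$

where in the last equality we again applied the nesting rule.
Moreover, it is easy to check that, in the shift-in-mean case
with even $\phi(x)$, we have $h_1(x) = h_0(x)$, finally yielding
$\log \frac{f_{\mathcal{M}_N}(x; \mathcal{H}_1)}{f_{\mathcal{M}_N}(x; \mathcal{H}_0)} = x$. At this point we can legitimately use
the Chernoff bound: $\beta_\nu = P(\mathcal{M}_N \le \gamma_\nu; \mathcal{H}_1) \le e^{\gamma_\nu}$, and the
proof is complete. ∎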
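For completeness, the Chernoff-type step can be spelled out as follows. This is a sketch under the assumption that the identity $f_{\mathcal{M}_N}(x; \mathcal{H}_1) = e^{x} f_{\mathcal{M}_N}(x; \mathcal{H}_0)$ derived above holds for all $x$, and that the miss detection probability is $\beta_\nu = P(\mathcal{M}_N \le \gamma_\nu; \mathcal{H}_1)$.

```latex
\begin{align*}
\beta_\nu &= \int_{-\infty}^{\gamma_\nu} f_{\mathcal{M}_N}(x; \mathcal{H}_1)\,dx
           = \int_{-\infty}^{\gamma_\nu} e^{x} f_{\mathcal{M}_N}(x; \mathcal{H}_0)\,dx \\
          &\le e^{\gamma_\nu} \int_{-\infty}^{\gamma_\nu} f_{\mathcal{M}_N}(x; \mathcal{H}_0)\,dx
           \le e^{\gamma_\nu}.
\end{align*}
```

The first inequality uses $e^{x} \le e^{\gamma_\nu}$ on the integration range, and the second uses the fact that a density integrates to at most one.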